Critical Data Compression

Author

  • John Scoville
Abstract

A new approach to data compression is developed and applied to multimedia content. This method separates messages into components suitable for both lossless coding and ’lossy’ or statistical coding techniques, compressing complex objects by separately encoding signals and noise. This is demonstrated by compressing the most significant bits of data exactly, since they are typically redundant and compressible, and either fitting a maximally likely noise function to the residual bits or compressing them using lossy methods. Upon decompression, the significant bits are decoded and added to a noise function, whether sampled from a noise model or decompressed from a lossy code. This results in compressed data similar to the original. Signals may be separated from noisy bits by considering derivatives of complexity in a manner akin to Kolmogorov’s approach or by empirical testing. The critical point separating the two represents the level beyond which compression using exact methods becomes impractical. Since redundant signals are compressed and stored efficiently using lossless codes, while noise is incompressible and practically indistinguishable from similar noise, such a scheme can enable high levels of compression for a wide variety of data while retaining the statistical properties of the original. For many test images, a two-part image code using JPEG2000 for lossy compression and PAQ8l for lossless coding produces less mean-squared error than an equal length of JPEG2000. For highly regular images, the advantage of such a scheme can be tremendous. Computer-generated images typically compress better using this method than through direct lossy coding, as do many black and white photographs and most color photographs at sufficiently high quality levels. Examples applying the method to audio and video coding are also demonstrated. Since two-part codes are efficient for both periodic and chaotic data, concatenations of roughly similar objects may be encoded efficiently, which leads to improved inference. Such codes enable complexity-based inference in data for which lossless coding performs poorly, enabling a simple but powerful minimal-description-based approach to audio, visual, and abstract pattern recognition. Applications to artificial intelligence are demonstrated, showing that signals using an economical lossless code have a critical level of redundancy which leads to better description-based inference than signals which encode either insufficient data or too much detail.

1 Complexity and Entropy

In contrast to information-losing or ’lossy’ data compression, the lossless compression of data, the central problem of information theory, was essentially opened and closed by Claude Shannon in a 1948 paper [13]. Shannon showed that the entropy formula (introduced earlier by Gibbs in the context of statistical mechanics) establishes a lower bound on the compression of data communicated across some channel: no algorithm can produce a code whose average codeword length is less than the Shannon information entropy. If the probability of codeword symbol i is P_i:

    S = −k ∑_i P_i log P_i        (1)

This quantity is the amount of information needed to invoke the axiom of choice and sample an element from a distribution or set with measure; any linear measure of choice must have this analytic form of expected log-probability [13]. This relies on knowledge of a probability distribution over the possible codewords.
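As a minimal illustration of equation (1), the sketch below computes the entropy of a short symbol string from its empirical frequencies, taking k = 1 and base-2 logarithms so the result is in bits per symbol; the function name and the example strings are illustrative choices, not taken from the paper.

    import math
    from collections import Counter

    def shannon_entropy(message, k=1.0, base=2):
        # Empirical P_i from symbol counts; S = -k * sum_i P_i log P_i.
        counts = Counter(message)
        n = len(message)
        return -k * sum((c / n) * math.log(c / n, base) for c in counts.values())

    # A redundant string has low entropy per symbol; a uniform one is maximal.
    print(shannon_entropy("aaaaaaab"))   # ~0.544 bits/symbol
    print(shannon_entropy("abcdefgh"))   # 3.0 bits/symbol

The second string, with all eight symbols equally likely, realizes the uniform case discussed next, where the entropy reduces to the logarithm of the number of states (log2 8 = 3 bits).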
Without a detailed knowledge of the process producing the data, or enough data to build a histogram, the entropy may not be easy to estimate. In many practical cases, entropy is most readily measured by using a general-purpose data compression algorithm whose output length tends toward the entropy, such as Lempel-Ziv. When the distribution is uniform, the Shannon/Gibbs entropy reduces to the Boltzmann entropy function of classical thermodynamics; this is simply the logarithm of the number of states.

The entropy limit for data compression established by Shannon applies to the exact (’lossless’) compression of any type of data. As such, Shannon entropy corresponds more directly to written language, where each symbol is presumably equally important, than to raw numerical data, where leading digits typically have more weight than trailing digits. In general, an infinite number of trailing decimal digits must be truncated from a real number in order to obtain a finite, rational measurement. Since some bits have much higher value than others, numerical data is naturally amenable to information-losing (’lossy’) data compression techniques, and such algorithms have become routine in the digital communication of multimedia data. For the case of a finite-precision numerical datum, rather than the Shannon entropy, a more applicable complexity measure might be Chaitin’s algorithmic prefix complexity [2], which measures the irreducible complexity of the leading digits of an infinite series of bits. The algorithmic prefix complexity is an example of a Kolmogorov complexity [9], the measure of minimal descriptive complexity playing a central role in Kolmogorov’s formalization of probability theory.

Prior to the twentieth century, this basic notion of a probability distribution function (pdf) had not changed significantly since the time of Gauss. After the analysis of Brownian motion by Einstein and others, building on the earlier work of Markov, the stochastic process became a popular idea. Stochastic processes represent the fundamental, often microscopic, actions which lead to frequencies tending, in the limit, to a probability density. Stochastic partial differential equations (for example, the Fokker-Planck equation) generate a pdf as
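The discussion above motivates the paper’s two-part strategy: the significant bits of numerical data are redundant and worth coding exactly, while the trailing bits behave like incompressible noise that only needs to be statistically similar after decoding. The following sketch is a toy illustration of that idea, not the paper’s JPEG2000/PAQ8l pipeline; the 4-bit split point, the Gaussian residual model, and all function names are assumptions made for illustration. The significant bits are compressed losslessly with zlib (a Lempel-Ziv-family coder), and the residual bits are replaced on decoding by samples from a simple noise model.

    import random
    import zlib

    def encode(samples, keep_bits=4):
        # Two-part code: compress the significant bits exactly and summarize
        # the discarded residual bits by a crude noise statistic (their mean).
        shift = 8 - keep_bits
        high = bytes(s >> shift for s in samples)
        residual = [s & ((1 << shift) - 1) for s in samples]
        noise_mean = sum(residual) / len(residual)
        return zlib.compress(high), noise_mean, shift

    def decode(code, noise_mean, shift, seed=0):
        # Recover the significant bits exactly and add sampled noise in place
        # of the residual, giving output statistically similar to the original.
        rng = random.Random(seed)
        limit = (1 << shift) - 1
        out = bytearray()
        for h in zlib.decompress(code):
            noise = min(limit, max(0, round(rng.gauss(noise_mean, 1.0))))
            out.append((h << shift) | noise)
        return bytes(out)

    # Usage: a smooth ramp with small additive noise; the high bits compress
    # well, while the noisy low bits are replaced rather than stored.
    rng = random.Random(1)
    original = bytes((i // 4) * 4 + rng.randrange(4) for i in range(256))
    code, mu, shift = encode(original)
    restored = decode(code, mu, shift)
    print(len(code), "bytes encode the significant bits of", len(original), "samples")

In the paper’s terms, the split point plays the role of the critical level beyond which exact coding of further bits stops paying off; here it is simply fixed by hand.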

Similar Articles

Comparing the Effect of Clinical Concept Mapping & Nursing Process in Developing Nursing Students’ Critical Thinking Skills

Introduction: Development of critical thinking and clinical education has remained a serious and considerable challenge throughout the nursing educational system in Iran. Education experts believe that effective teaching methods such as concept mapping and nursing process are practical strategies for critical development. Thus, this study was carried out to compare the effectiveness of clinical...

Elastic Buckling Analysis of Composite Shells with Elliptical Cross-section under Axial Compression

In the present research, the elastic buckling of composite cross-ply elliptical cylindrical shells under axial compression is studied through finite element approach. The formulation is based on shear deformation theory and the serendipity quadrilateral eight-node element is used to study the elastic behavior of elliptical cylindrical shells. The strain-displacement relations are accurately acc...

First-Order Formulation for Functionally Graded Stiffened Cylindrical Shells Under Axial Compression

The buckling analysis of stiffened cylindrical shells by rings and stringers made of functionally graded materials subjected to axial compression loading is presented. It is assumed that the material properties vary as a power form of the thickness coordinate variable. The fundamental relations, the equilibrium and stability equations are derived using the first order shear deformation theory. ...

Buckling of nanotubes under compression considering surface effects

In this paper, the modified Euler-Bernoulli beam model is presented to examine the influence of surface elasticity and residual surface tension on the critical force of axial buckling of nanotubes in the presence of rotary inertia. An explicit solution is derived for the buckling loads of microscaled Euler beams considering surface effects. The size-dependent buckling behavior of the nanotube d...

Medical Image Compression Based on Region of Interest

Medical images show a great interest since it is needed in various medical applications. In order to decrease the size of medical images which are needed to be transmitted in a faster way; Region of Interest (ROI) and hybrid lossless compression techniques are applied on medical images to be compressed without losing important data. In this paper, a proposed model will be presented and assessed...

Journal:
  • CoRR

Volume: abs/1112.5493   Issue:

Pages: -

Publication date: 2011